A METHOD AND A SYSTEM FOR A DATA PROCESSOR



TECHNICAL FIELD 

The invention refers to a method and a processing system for a communications
network, according to the non-characterizing portions of claims 1 and 8, respectively.



BACKGROUND 



In processor technology, such as data packet processing technology, and more
specifically in the entering of instructions for the processes, traditional linker
algorithms use large chunks of code, i.e. machine code chunks from an assembler.
Traditional linkers also have an object file where the code chunk is stored along with
relocation objects. The linkers place many chunks of code sequentially in a memory and
link the chunks of code together using the relocation objects. The code is optimized
for memory utilization, i.e. all code is placed in sequence.

A disadvantage with known processor assembly-linking algorithms is that it is
difficult to meet processing requirements of the processor during programming and
compiling. More specifically, it is difficult to include real time requirements of the
data processing, when programming and compiling using traditional linker algorithms.



SUMMARY 

It is an object of the present invention to present a method and a processing system
for a communications network, at which it is easier to meet processing requirements
of data processes in the network, during implementation of instructions for the
processes.

It is also an object of the present invention to present a method and a processing
system for a communications network, at which it is easier to take into consideration
real time requirements of the data processing in the network, during programming and
compiling of instructions for the processes.

The objects are achieved by a method and a processing system for a communications
network, according to the characterizing portions of claims 1 and 8, respectively.



Dividing the program code into a plurality of sequences, defining, based on the
program code, a plurality of relocation objects, each corresponding to a dependency
relationship between two or more of the sequences, and allocating the sequences to a
processor instruction memory, provides a structure of the code that makes it easy to
manipulate in order to meet data processing requirements of the communications network.

Preferably, at least one directed graph is formed, based on at least some of the
sequences and at least some of the relocation objects, and a longest execution path
through the directed graph is determined. Sequences in the instruction memory can be
moved and state preserving operations can be entered, so as to make at least two
execution paths equally long. This provides an effective tool for controlling the code
in order to meet real time requirements in the communications network. More
specifically, the invention facilitates, as opposed to known processor assembly-linking
algorithms, which are often designed to optimize memory utilization, the determination
of the total execution time for each alternative execution path, avoiding difficulties
in meeting processing time requirements.



88/02/2002 17:25 ALB I HNS STOCKHOLM RB * 0668386 N*.4?4 004 


BRIEF DESCRIPTION OF FIGURES 



Below, the invention will be described in detail, with reference to the accompanying
drawings, in which

- fig. 1 shows schematically the structure of an instruction memory, according to a
preferred embodiment of the invention, in relation to programmed instructions
and a data path,

- fig. 2 shows schematically the instruction memory in fig. 1 and a part of it
enlarged,

- fig. 3 shows schematically the structure of a program according to a preferred
embodiment of the invention, and

- fig. 4-7 show examples of program structures.



DETAILED DESCRIPTION 

Here, reference is made to fig. 1. The processing system according to the invention
is adapted to process physical data passing through a processing pipeline 1. The data
can be in the form of data packets. The direction of the data stream is indicated by
an arrow A. The processing system can be a PISC (Packet Instruction Set Computer), as
described in the patent application SE0100221-1. The processing pipeline 1 comprises a
plurality of pipeline elements 2. Between the pipeline elements 2, engine access points
3 are located, at which, for example, data packet classification can take place.

The processing pipeline 1 can be adapted to a so-called classify-action model.
Thereby, in the processing of a data packet, first a classification is performed, e.g.
a CAM lookup. A matching classification rule can start an action with an argument. The
action can be a process program that is executed in a pipeline element 2 and performs a
task as a directed graph of sequences starting with a root sequence, as described
below. The processing system can be adapted to run many forwarding plane applications
simultaneously, e.g. both MPLS and [illegible]. The applications can be separated
logically, for example by special tags for classification and different sets of
actions (packet programs).

The processing system comprises an instruction memory 4 for storing instructions for
the data processing. The instruction memory 4 comprises rows 5 and columns 6. Each
pipeline element 2 comprises a number of instruction steps for the data process,
whereby each instruction step is allocated to a column 6 in the instruction memory 4.
Accordingly, each pipeline element 2 is allocated a certain number of columns, as
indicated by the broken lines B. Each row 5 corresponds to an address in the memory 4.

In a method according to a preferred embodiment of the invention, described in more
detail below, an assembler generates sequences 7 of instruction words 8, preferably
machine code instruction words, e.g. VLIW instruction words, and a linker places the
sequences into the instruction memory 4 and links the sequences together. Referring to
fig. 2, in the memory 4, each sequence occupies memory space in the same row and in
adjacent columns. Each sequence is a row of machine code instruction words that will be
executed consecutively. After executing an instruction word, the processor will
normally execute another instruction word in the next column in the same row.
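
The row and column layout described above can be illustrated with a minimal Python sketch. The class name `InstructionMemory`, its methods and the word labels are hypothetical and serve only as an illustration; the actual memory organization of a PISC processor is not specified here.

```python
from typing import List, Optional


class InstructionMemory:
    """Hypothetical 2D memory: rows are addresses, columns are instruction steps."""

    def __init__(self, rows: int, columns: int) -> None:
        # None marks an empty cell
        self.cells: List[List[Optional[str]]] = [[None] * columns for _ in range(rows)]

    def place(self, row: int, start_column: int, words: List[str]) -> None:
        """Store a sequence in one row, in adjacent columns."""
        end = start_column + len(words)
        if end > len(self.cells[row]):
            raise ValueError("sequence does not fit in the row")
        if any(cell is not None for cell in self.cells[row][start_column:end]):
            raise ValueError("target cells are already occupied")
        self.cells[row][start_column:end] = words

    def run(self, row: int, start_column: int) -> List[str]:
        """Default flow: next column, same row, until an empty cell is reached."""
        executed: List[str] = []
        for word in self.cells[row][start_column:]:
            if word is None:
                break
            executed.append(word)
        return executed


memory = InstructionMemory(rows=4, columns=8)
memory.place(row=0, start_column=0, words=["w0", "w1", "w2"])
memory.place(row=1, start_column=2, words=["w3", "w4"])
```

Jumps between rows are left out of this sketch; it only shows that a sequence is a run of adjacent cells in a single row.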



As depicted in fig. 1, the processing system comprises a process program, which 
corresponds to a directed graph 9 of sequences 7 that each perform a certain task on 
data, e.g. data packets, such as forwarding. 

As depicted in fig. 3, the directed graph starts with a root sequence 7a. It is limited
to one pipeline element 2. Each root sequence 7a is a sequence at the beginning of a
process program, and can be marked as "Root" in the program. A root sequence 7a can be
started as a result of a CAM classification or an instruction in the form of a jump in
a sequence in a preceding pipeline element 2. The linker can export a start row of the
root sequence so that run-time software can map actions to classification rules.



The directed graph ends in leaf sequences 7b. Each leaf sequence 7b is a sequence
that ends with a relocation instruction specifying that the packet program should exit
or jump to a sequence in another pipeline element 2. Thus, the linker can link directed
graphs, or programs, in different pipeline elements 2, by connecting, or linking, a
leaf sequence in one program to a root sequence in another program.

The directed graph 9 comprises branches 10. Each branch 10 is a relocation object
that provides information that there is an alternative sequence to jump to at the
instruction at which the branch is located. By using a branch 10 the processor can
perform a jump to another sequence in another row, but a default branch is executed in
the same row and belongs to the same sequence.

A sequence exit can be any of the following: a relocation object instructing a jump
to another sequence in the same pipeline element 2, a relocation object instructing a
jump to a sequence in a following pipeline element 2, or an instruction to end the
program. Either of the two latter alternatives forms an exit in a leaf sequence 7b.

The relocation objects result in the process program having a finite number of
alternative paths until the program exits in one or many exit points (leaves).
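
To illustrate why the number of alternative paths is finite, the following Python sketch enumerates them for a small graph. The `Relocation` record, its fields and the offsets are assumptions made for the example (the sequence names are borrowed from fig. 4); the sketch also assumes that every sequence may run to its own end along the default branch and that the graph contains no circular references.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Relocation:
    """Hypothetical relocation object (branch) between two sequences."""
    source: str   # sequence containing the branch
    offset: int   # 0-based index of the instruction holding the branch
    target: str   # sequence jumped to when the branch is taken


def alternative_paths(root: str, relocations: List[Relocation]) -> List[List[str]]:
    """List every chain of sequences from root to an exit (assumes no cycles)."""
    outgoing: Dict[str, List[Relocation]] = {}
    for reloc in relocations:
        outgoing.setdefault(reloc.source, []).append(reloc)

    def walk(name: str) -> List[List[str]]:
        paths = [[name]]  # default branch: stay in the sequence until it exits
        for reloc in sorted(outgoing.get(name, []), key=lambda r: r.offset):
            paths.extend([name] + tail for tail in walk(reloc.target))
        return paths

    return walk(root)


relocs = [Relocation("7a", 5, "71"), Relocation("7a", 2, "72")]
paths = alternative_paths("7a", relocs)
```

With the example data, `paths` holds three alternatives: the root alone, the root followed by leaf 72, and the root followed by leaf 71.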

In a method for a communications network according to a preferred embodiment of the
invention, the assembler receives a program code, comprising a plurality of
instructions for the communications network. The assembler divides the program code
into a plurality of sequences 7, e.g. sequences of a PISC code, and defines, based on
the program code, a plurality of relocation objects 10, each corresponding to a
dependency relationship between two or more of the sequences 7. The assembler forms at
least one directed graph 9, based on at least some of the sequences 7 and at least
some of the relocation objects 10, the directed graph having one or many roots.

In other words, the assembler or "compiler" divides the code into "atomic" tasks,
activities or sequences that have dependencies on each other such that they need to be
performed in a consecutive order.

The directed graph 9 is stored as an object file. The object file comprises a code
sequence format and a relocation object format that define the dependencies between
the sequences. Each code sequence has a length (a number of instructions).

The directed graph is analyzed. The linker validates that the directed graph consists
of one or more partially ordered sets. This means determining the existence of any
circular reference by any of the relocation objects 10 between any of the sequences 7.
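
The check for circular references can be sketched as a depth-first search. This is one common way to detect cycles in a directed graph, not necessarily the one used by the linker; the function name and the edge map (sequence name to the names of the sequences its relocation objects point at) are assumptions made for the example.

```python
from typing import Dict, List


def has_circular_reference(edges: Dict[str, List[str]]) -> bool:
    """Return True if any chain of relocation objects leads back to its start."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    colour = {node: WHITE for node in nodes}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for target in edges.get(node, []):
            if colour[target] == GREY:
                return True  # back edge: the target is already on the current path
            if colour[target] == WHITE and visit(target):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in sorted(nodes))


acyclic = {"root": ["a", "b"], "a": ["c"], "b": ["c"]}
cyclic = {"root": ["a"], "a": ["b"], "b": ["a"]}
```

A graph that passes this check is a directed acyclic graph, so a longest execution path through it is well defined.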

Preferably, the method comprises the linker validating that the code meets a
predetermined time requirement, i.e. a hard execution time requirement. This is done
with a longest path algorithm over the directed graph, i.e. determining a longest
execution path through the directed graph, which includes adding sequence lengths or
lengths of partial sequences. In the pipeline processing case the execution time
requirement is limited by the number of processing stages. In a more general case the
time limit can be any hard real-time requirement.
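
A minimal sketch of such a longest path computation is given below, assuming the graph has already been validated as free of circular references. The data model is an assumption for illustration: each sequence has a length in instruction words, each relocation object is a tuple of source sequence, 0-based offset of the branch instruction, and target sequence, and a jump itself costs no extra step. The lengths and offsets are invented example values.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# (source sequence, offset of the branch instruction, target sequence)
Relocation = Tuple[str, int, str]


def longest_path(root: str, lengths: Dict[str, int],
                 relocations: List[Relocation]) -> int:
    """Longest execution path, in instruction words, starting at root."""

    @lru_cache(maxsize=None)
    def longest_from(name: str) -> int:
        best = lengths[name]  # default branch: run the whole sequence
        for source, offset, target in relocations:
            if source == name:
                # offset + 1 words of the partial sequence, then the target
                best = max(best, offset + 1 + longest_from(target))
        return best

    return longest_from(root)


def meets_time_requirement(root: str, lengths: Dict[str, int],
                           relocations: List[Relocation], max_steps: int) -> bool:
    """Validate the hard execution time requirement, e.g. the number of stages."""
    return longest_path(root, lengths, relocations) <= max_steps


lengths = {"7a": 6, "71": 4, "72": 2}
relocations = [("7a", 5, "71"), ("7a", 2, "72")]
print(longest_path("7a", lengths, relocations))  # 10
```

In the example the longest path runs through all six words of the root and the four words of leaf 71, while the path through leaf 72 takes only three words of the root plus two words of the leaf.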

The linker places the sequences in the instruction memory, as described above. The
two-dimensional instruction memory 4 allows many alternative execution paths to be
stored in parallel. Hence the program has to be stored as a directed graph.



Referring to fig. 4, the linker moves at least one sequence in the instruction memory
and allocates at least one state preserving operation, or no operation instruction,
NOP, in the instruction memory, so as to make at least two execution paths equally
long, whereby the length of the at least two execution paths corresponds to the longest
execution path. As an example, in fig. 4 the longest execution path is determined by
the root sequence 7a, a first relocation object 101 and a first leaf sequence 71. An
alternative execution path is formed by a part of the root sequence 7a, a second
relocation object 102 and a second leaf sequence 72. The alternative execution path has
a shorter execution time compared to the longest execution path, due to the second
relocation object 102 being located closer to the root of the root sequence 7a than
the first relocation object 101, and the second leaf sequence 72 being shorter than
the first leaf sequence 71. The linker moves the second leaf sequence 72 and enters
state preserving operations NOP before and after the second leaf sequence 72.
Alternatively, the second leaf sequence 72 is not moved and state preserving operations
NOP are entered after the second leaf sequence 72. Thereby, all alternative execution
paths become equally long, and the execution time is equal to the longest path for all
possible alternative paths.
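
The padding step can be sketched as follows. The function `pad_leaf`, the `NOP` marker string and the numbers are hypothetical; the sketch only shows the arithmetic of filling a shorter path up to the length of the longest one, with the NOPs placed either after the leaf or split before and after it.

```python
from typing import List

NOP = "NOP"  # hypothetical marker for a state preserving operation


def pad_leaf(leaf_words: List[str], path_length: int, longest: int,
             nops_before: int = 0) -> List[str]:
    """Extend a leaf so that its execution path matches the longest path.

    path_length is the length of the whole path ending in this leaf, counted in
    instruction words; nops_before corresponds to moving the leaf to a later column.
    """
    missing = longest - path_length
    if missing < 0:
        raise ValueError("path is longer than the longest path")
    if not 0 <= nops_before <= missing:
        raise ValueError("nops_before must lie between 0 and the padding needed")
    return [NOP] * nops_before + leaf_words + [NOP] * (missing - nops_before)


leaf_72 = ["a", "b"]
# Assumed example: the path through leaf 72 takes 5 words, the longest path 10.
padded_after = pad_leaf(leaf_72, path_length=5, longest=10)
padded_moved = pad_leaf(leaf_72, path_length=5, longest=10, nops_before=2)
```

Both variants add the same five NOPs; they differ only in where the leaf sits relative to them, which corresponds to the two alternatives described for fig. 4.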

Referring to fig. 5, in an alternative embodiment, a special no operation row 12 in
the memory 4 is used by the linker, whereby leaf sequences 7b in execution paths that
are shorter than the longest execution path can jump, by means of a relocation object
10, to the no operation row 12 when finished. It is important that said jump is carried
out to the correct position in the no operation row 12 so that the execution paths
become equally long.
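
How the correct position in the no operation row might be computed is sketched below. The sketch assumes, purely for illustration, that the no operation row is a run of NOPs executed left to right up to its end, and that the jump itself costs no extra instruction step; the function name and parameters are hypothetical.

```python
def nop_row_entry(nop_row_length: int, path_length: int, longest: int) -> int:
    """Index in the no operation row at which a finished leaf should enter.

    Entering at the returned index leaves exactly (longest - path_length) NOPs
    to execute before the end of the row.
    """
    missing = longest - path_length
    if missing < 0:
        raise ValueError("path is longer than the longest path")
    if missing > nop_row_length:
        raise ValueError("no operation row is too short")
    return nop_row_length - missing


# Assumed example: a row of 8 NOPs, a path of 5 words, a longest path of 10 words.
entry = nop_row_entry(nop_row_length=8, path_length=5, longest=10)
```

Here the shorter path enters at index 3 and executes the remaining five NOPs, so it finishes after ten steps like the longest path.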



Fig. 6 depicts a situation where alternative execution paths are to be synchronized to
a shared sequence 7c. The linker moves a sequence 73 in a path that is shorter than the
longest execution path, and enters state preserving operations NOP before and after
the moved sequence 73.

Fig. 7 depicts an alternative to the method described with reference to fig. 6. The
linker can add a no operation sequence 14 on the same row as, but before, the shared
sequence 7c. A relocation object 103 is entered to make the shorter execution path
jump to the no operation sequence 14 before entering the shared sequence 7c. This makes
the processor easier to program, since any execution path that is shorter than the
longest one can be extended to correspond to the latter simply by entering a relocation
object at its end pointing to the defined no operation sequence 14 or, referring to
fig. 5, the no operation row 12.

The invention guarantees that the packet program maintains its state to the next
engine application point (EAP) even after the packet program has exited. The linker can
use a super root with a length of zero instructions as the root for all packet
programs, which makes the graph operations easier.



As has been described, the invention provides for the sequences to be moved and the
state preserving operations to be entered in such a way that sequences that are
dependent on each other are synchronized.

Above, the invention has been described pointing out its usefulness for a program
code for a pipelined processing system, such as a PISC processor. However, the
invention is suited for program codes for any hard real-time system, e.g. on a
traditional processor, where the exact execution time is critical.

CLAIMS

1. A method for a communications network, comprising the step of receiving a
program code, comprising a plurality of instructions for the communications
network, characterized by the steps of

- dividing the program code into a plurality of sequences (7),

- defining, based on the program code, a plurality of relocation objects (10), each
corresponding to a dependency relationship between two or more of the
sequences (7), and

- allocating the sequences (7) to a processor instruction memory (4).

2. A method according to claim 1, comprising the steps of forming at least one
directed graph, based on at least some of the sequences (7) and at least some of the
relocation objects (10), and determining a longest execution path through the
directed graph.



3. A method according to claim 2, comprising the step of entering at least one state
preserving operation (NOP) in the instruction memory (4), so as to make at least
two execution paths equally long.

4. A method according to claim 3, comprising the step of moving at least one
sequence in the instruction memory (4).

5. A method according to claim 3 or 4, wherein the length of the at least two
execution paths corresponds to the longest execution path.

6. A method according to any of the preceding claims, comprising the step of
determining the existence of any circular reference by any of the relocation objects
(10) between any of the sequences (7).

7. A method according to any of the preceding claims, comprising the step of
linking at least one sequence, obtained by the step of dividing the program code, to a
sequence, obtained by dividing another program code.

8. A processing system for a communications network, comprising an assembler
adapted to receive a program code, comprising a plurality of instructions for the
communications network, characterized by the assembler being adapted to
divide the program code into a plurality of sequences (7), and define, based on the
program code, a plurality of relocation objects (10), each corresponding to a
dependency relationship between two or more of the sequences (7), and a linker
being adapted to allocate the sequences (7) to a processor instruction memory
(4).

9. A processing system according to claim 8, wherein the assembler is adapted to 
form at least one directed graph, based on at least some of the sequences (7) and 
at least some of the relocation objects (10), and the linker is adapted to determine 
a longest execution path through the directed graph. 



10. A processing system according to claim 9, wherein the linker is adapted to enter 
at least one state preserving operation (NOP) in the instruction memory (4), so as 
to make at least two execution paths equally long. 

11. A processing system according to claim 10, wherein the linker is adapted to
move at least one sequence in the instruction memory (4).

12. A processing system according to claim 10 or 11, wherein the length of the at
least two execution paths corresponds to the longest execution path.







13. A processing system according to any of the claims 8 to 12, wherein the linker is
adapted to determine the existence of any circular reference by any of the
relocation objects (10) between any of the sequences (7).

14. A processing system according to any of the claims 8 to 13, wherein the linker is
adapted to link at least one sequence, obtained by dividing the program code, to
a sequence, obtained by dividing another program code.

ABSTRACT

The invention refers to a method and a processing system for a communications
network. The method comprises the step of receiving a program code, comprising a
plurality of instructions for the communications network, dividing the program code
into a plurality of sequences (7), defining, based on the program code, a plurality of
relocation objects (10), each corresponding to a dependency relationship between
two or more of the sequences (7), and allocating the sequences (7) to a processor
instruction memory (4). Preferably, at least one directed graph is formed, based on at
least some of the sequences (7) and at least some of the relocation objects (10), and
a longest execution path through the directed graph is determined. Sequences (7) in
the instruction memory (4) can be moved and state preserving operations (NOP) can
be entered, so as to make at least two execution paths equally long.